Combined One Sense Disambiguation of Abbreviations

نویسندگان

  • Yaakov HaCohen-Kerner
  • Ariel Kass
  • Ariel Peretz
چکیده

A process that attempts to solve abbreviation ambiguity is presented. Various contextrelated features and statistical features have been explored. Almost all features are domain independent and language independent. The application domain is Jewish Law documents written in Hebrew. Such documents are known to be rich in ambiguous abbreviations. Various implementations of the one sense per discourse hypothesis are used, improving the features with new variants. An accuracy of 96.09% has been achieved by SVM.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Building a high-quality sense inventory for improved abbreviation disambiguation

MOTIVATION The ultimate goal of abbreviation management is to disambiguate every occurrence of an abbreviation into its expanded form (concept or sense). To collect expanded forms for abbreviations, previous studies have recognized abbreviations and their expanded forms in parenthetical expressions of bio-medical texts. However, expanded forms extracted by abbreviation recognition are mixtures ...

متن کامل

Clinical Acronym/Abbreviation Normalization using a Hybrid Approach

A unique characteristic of clinical text is the pervasive use of acronyms and abbreviations, which are often ambiguous. The ShARe/CLEF eHealth Evaluation Lab organized three shared tasks on clinical natural language processing (NLP) and information retrieval (IR) in 2013 and one of them was to normalize acronyms/abbreviations to UMLS concept unique identifiers (CUIs). This paper describes a hyb...

متن کامل

Automatic Resolution of Ambiguous Abbreviations in Biomedical Texts using Support Vector Machines and One Sense Per Discourse Hypothesis

We present an algorithm to disambiguate abbreviations in Medline abstracts using Support Vector Machines (SVM) and one sense per discourse hypothesis. In contrast to other work using SVM for natural language disambiguation which always depend on handcrafted training and testing data, the algorithm provided here automatically extracts the training and testing data through searching long form of ...

متن کامل

Disambiguation of Biomedical Abbreviations

Abbreviations are common in biomedical documents and many are ambiguous in the sense that they have several potential expansions. Identifying the correct expansion is necessary for language understanding and important for applications such as document retrieval. Identifying the correct expansion can be viewed as a Word Sense Disambiguation (WSD) problem. A WSD system that uses a variety of know...

متن کامل

Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis

Abbreviations are widely used in many languages and disambiguation of abbreviations is critical. In this research, a structured process that attempts to solve the problem of abbreviation ambiguity is presented. Various baseline methods have been explored, including context-related methods and statistical methods. Almost all methods are domain-independent and language independent. The applicatio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008